Appendix A Variational Paragraph Embedder A.1 Selection of substitution rate p

Neural Information Processing Systems

Figure 4: Impact of the proportion of injected noise for learning Paragraph Embeddings on the XSum dataset. (Figure 4). The results of the ablation study are presented in Table 5. Embedder in providing clean and denoised reconstructions. In general, it has been observed that generations progress in a coarse-to-fine manner. The early time step, which is close to 1, tends to be less fluent and generic. This was the nicest stay we have ever had. Turtle Bay was a great resort. This was the nicest stay we have ever had.


MR-UIE: Multi-Perspective Reasoning with Reinforcement Learning for Universal Information Extraction

Li, Zhongqiu, Wang, Shiquan, Fang, Ruiyu, Bao, Mengjiao, Wu, Zhenhe, Song, Shuangyong, Li, Yongxiang, He, Zhongjiang

arXiv.org Artificial Intelligence

Information extraction (IE) is a fundamental task in natural language processing (NLP), which encompasses a wide range of subtasks such as Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE) [1-4]. Traditionally, these tasks have been addressed by specialized models trained on task-specific datasets. However, the fragmentation of tasks and schemas has hindered the development of generalizable and scalable IE systems. To address this limitation, recent research has focused on universal information extraction (UIE), which aims to model all IE tasks within a universal framework. A seminal work in this direction was proposed by Lu et al., who introduced a structured generation paradigm that encodes diverse IE tasks into a common semantic representation[5]. Building on this, InstructUIE[6] extended the idea by incorporating multi-task instruction tuning, enabling models to generalize across tasks via natural language instructions. With the emergence of powerful LLMs[7-11], significant advancements have been made across long-standing NLP tasks such as text classification[12-16], intent recognition[17, 18], entity linking[19-22], and beyond. Inspired by their robust performance and adaptability, researchers have explored their potential for information extraction through prompting and in-context learning[23, 24]. For example, CodeIE demonstrated that code generation models can serve as strong few-shot IE extractors by using structured code-like commands[25].


TableZoomer: A Collaborative Agent Framework for Large-scale Table Question Answering

Xiong, Sishi, He, Ziyang, He, Zhongjiang, Zhao, Yu, Pan, Changzai, Zhang, Jie, Wu, Zhenhe, Song, Shuangyong, Li, Yongxiang

arXiv.org Artificial Intelligence

While large language models (LLMs) have shown promise in the table question answering (TQA) task through prompt engineering, they face challenges in industrial applications, including structural heterogeneity, difficulties in target data localization, and bottlenecks in complex reasoning. To address these limitations, this paper presents TableZoomer, a novel LLM-powered, programming-based agent framework. It introduces three key innovations: (1) replacing the original fully verbalized table with a structured table schema to bridge the semantic gap and reduce computational complexity; (2) a query-aware table zooming mechanism that dynamically generates a sub-table schema through column selection and entity linking, significantly improving target localization efficiency; and (3) a Program-of-Thoughts (PoT) strategy that transforms queries into executable code to mitigate numerical hallucination. Additionally, we integrate the reasoning workflow with the ReAct paradigm to enable iterative reasoning. Extensive experiments demonstrate that our framework maintains the usability advantages while substantially enhancing performance and scalability across tables of varying scales. When implemented with the Qwen3-8B-Instruct LLM, TableZoomer achieves accuracy improvements of 19.34% and 25% over conventional PoT methods on the large-scale DataBench dataset and the small-scale Fact Checking task of the TableBench dataset, respectively.
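The schema-then-zoom idea in this abstract can be illustrated with a small sketch. This is not TableZoomer's implementation: the function names `table_schema` and `zoom_schema` are hypothetical, the table is a toy list of dictionaries, and simple keyword matching stands in for the LLM-driven column selection and entity linking that the abstract describes.

```python
def table_schema(rows, n_examples=2):
    """Summarize a table (list of dicts) by column name, value type, and a few examples.

    Assumption: every row has the same keys and the table is non-empty.
    """
    schema = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        schema[col] = {
            "type": type(values[0]).__name__,
            "examples": values[:n_examples],
        }
    return schema


def zoom_schema(schema, query):
    """Keep only columns whose name occurs as a word in the query.

    Keyword matching is a crude stand-in for LLM-based column selection.
    """
    words = query.lower().split()
    return {col: info for col, info in schema.items() if col.lower() in words}


rows = [
    {"city": "Oslo", "country": "Norway", "population": 700000},
    {"city": "Bergen", "country": "Norway", "population": 290000},
    {"city": "Lyon", "country": "France", "population": 520000},
]
schema = table_schema(rows)
focused = zoom_schema(schema, "total population per country")

# A Program-of-Thoughts step would then emit code over the full table,
# guided only by the focused schema, for example:
totals = {}
for row in rows:
    totals[row["country"]] = totals.get(row["country"], 0) + row["population"]
```

The point of the sketch is that the prompt only needs the compact schema rather than every row, while the numeric answer comes from executing code on the real table.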


TableReasoner: Advancing Table Reasoning Framework with Large Language Models

Xiong, Sishi, Wang, Dakai, Zhao, Yu, Zhang, Jie, Pan, Changzai, He, Haowei, Li, Xiangyu, Chang, Wenhan, He, Zhongjiang, Song, Shuangyong, Li, Yongxiang

arXiv.org Artificial Intelligence

This paper presents our system developed for table question answering (TQA). TQA tasks face challenges due to the characteristics of real-world tabular data, such as large size, incomplete column semantics, and entity ambiguity. To address these issues, we propose a large language model (LLM)-powered and programming-based table reasoning framework, named TableReasoner. It models a table using a schema that combines structural and semantic representations, enabling holistic understanding and efficient processing of large tables. We design a multi-step schema linking plan to derive a focused table schema that retains only query-relevant information, eliminating ambiguity and alleviating hallucinations. This focused table schema provides precise and sufficient table details for query refinement and programming. Furthermore, we integrate the reasoning workflow into an iterative thinking architecture, allowing incremental cycles of thinking, reasoning, and reflection. Our system achieves first place in both subtasks of SemEval-2025 Task 8.


An Efficient Transport-Based Dissimilarity Measure for Time Series Classification under Warping Distortions

Aldroubi, Akram, Martín, Rocío Díaz, Medri, Ivan, Pas, Kristofor E., Rohde, Gustavo K., Rubaiyat, Abu Hasnat Mohammad

arXiv.org Machine Learning

Time Series Classification (TSC) is an important problem with numerous applications in science and technology. Dissimilarity-based approaches, such as Dynamic Time Warping (DTW), are classical methods for distinguishing time series when time deformations are confounding information. In this paper, starting from a deformation-based model for signal classes, we define a problem statement for the time series classification problem. We show that, under theoretically ideal conditions, a continuous version of the classic 1NN-DTW method can solve the stated problem, even when only one training sample is available. In addition, we propose an alternative dissimilarity measure based on Optimal Transport and show that it can also solve the aforementioned problem statement at a significantly reduced computational cost. Finally, we demonstrate the application of the newly proposed approach to simulated and real time series classification data, showing the efficacy of the method.
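For orientation, the classic discrete 1NN-DTW baseline mentioned in this abstract can be sketched as follows. This is the textbook dynamic-programming form with an absolute-difference local cost, not the continuous version analyzed in the paper and not the proposed transport-based measure; the names `dtw_distance` and `classify_1nn` and the toy templates are illustrative assumptions.

```python
import math


def dtw_distance(a, b):
    """Discrete DTW between two sequences with |x - y| as the local cost."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_1nn(query, templates):
    """Return the label of the nearest template; one template per class."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# One training sample per class, echoing the single-sample setting above.
templates = {"bump": [0, 1, 3, 1, 0], "ramp": [0, 1, 2, 3, 4]}
warped_bump = [0, 0, 1, 3, 3, 1, 0]  # a time-stretched copy of the bump
label = classify_1nn(warped_bump, templates)
```

The table fill costs O(nm) time per pair of series, which is the kind of computational cost that motivates cheaper alternatives.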


Towards Type Agnostic Cyber Defense Agents

Galinkin, Erick, Pountrourakis, Emmanouil, Mancoridis, Spiros

arXiv.org Artificial Intelligence

With computing now ubiquitous across government, industry, and education, cybersecurity has become a critical component for every organization on the planet. Due to this ubiquity of computing, cyber threats have continued to grow year over year, leading to labor shortages and a skills gap in cybersecurity. As a result, many cybersecurity product vendors and security organizations have looked to artificial intelligence to shore up their defenses. This work considers how to characterize attackers and defenders in one approach to the automation of cyber defense -- the application of reinforcement learning. Specifically, we characterize the types of attackers and defenders in the sense of Bayesian games and, using reinforcement learning, derive empirical findings about how to best train agents that defend against multiple types of attackers.


Jump Starting Bandits with LLM-Generated Prior Knowledge

Alamdari, Parand A., Cao, Yanshuai, Wilson, Kevin H.

arXiv.org Artificial Intelligence

We present substantial evidence demonstrating the benefits of integrating Large Language Models (LLMs) with a Contextual Multi-Armed Bandit framework. Contextual bandits have been widely used in recommendation systems to generate personalized suggestions based on user-specific contexts. We show that LLMs, pre-trained on extensive corpora rich in human knowledge and preferences, can simulate human behaviours well enough to jump-start contextual multi-armed bandits to reduce online learning regret. We propose an initialization algorithm for contextual bandits by prompting LLMs to produce a pre-training dataset of approximate human preferences for the bandit. This significantly reduces online learning regret and data-gathering costs for training such models. Our approach is validated empirically through two sets of experiments with different bandit setups: one which utilizes LLMs to serve as an oracle and a real-world experiment utilizing data from a conjoint survey experiment.
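The jump-start idea can be sketched with a deliberately simple bandit. This is a minimal illustration, not the paper's algorithm: `TabularBandit` and `warm_start` are hypothetical names, contexts are discrete labels, selection is purely greedy, and the hard-coded `prior_rows` stand in for a pre-training dataset that would be produced by prompting an LLM.

```python
from collections import defaultdict


class TabularBandit:
    """Greedy contextual bandit with a running-mean reward per (context, arm)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental mean update.
        self.means[key] += (reward - self.means[key]) / self.counts[key]

    def select(self, context):
        # Ties resolve to the earliest arm in self.arms.
        return max(self.arms, key=lambda arm: self.means[(context, arm)])


def warm_start(bandit, prior_rows):
    """Replay (context, arm, reward) rows before any online interaction."""
    for context, arm, reward in prior_rows:
        bandit.update(context, arm, reward)


# Placeholder for LLM-generated approximate preferences (assumed values).
prior_rows = [
    ("student", "sms", 1.0),
    ("student", "email", 0.0),
    ("retiree", "email", 1.0),
    ("retiree", "sms", 0.0),
]
bandit = TabularBandit(["email", "sms"])
warm_start(bandit, prior_rows)
first_choice = bandit.select("student")

# Real feedback is folded into the same estimates afterwards.
for _ in range(3):
    bandit.update("student", "email", 1.0)
```

Without the warm start, every estimate would begin at zero and the first choices would be arbitrary; the replayed rows give the bandit a usable prior that real feedback then corrects.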


A Mathematical Framework for the Problem of Security for Cognition in Neurotechnology

Bagley, Bryce Allen

arXiv.org Artificial Intelligence

The rapid advancement in neurotechnology in recent years has created an emerging critical intersection between neurotechnology and security. Implantable devices, non-invasive monitoring, and non-invasive therapies all carry with them the prospect of violating the privacy and autonomy of individuals' cognition. A growing number of scientists and physicians have made calls to address this issue, but applied efforts have been relatively limited. A major barrier hampering scientific and engineering efforts to address Cognitive Security is the lack of a clear means of describing and analyzing relevant problems. In this paper we develop Cognitive Security, a mathematical framework which enables such description and analysis by drawing on methods and results from multiple fields. We demonstrate certain statistical properties which have significant implications for Cognitive Security, and then present descriptions of the algorithmic problems faced by attackers attempting to violate privacy and autonomy, and defenders attempting to obstruct such attempts.


Use of Graph Neural Networks in Aiding Defensive Cyber Operations

Mitra, Shaswata, Chakraborty, Trisha, Neupane, Subash, Piplai, Aritran, Mittal, Sudip

arXiv.org Artificial Intelligence

In an increasingly interconnected world, where information is the lifeblood of modern society, regular cyber-attacks sabotage the confidentiality, integrity, and availability of digital systems and information. Additionally, cyber-attacks differ depending on the objective and evolve rapidly to evade defensive systems. However, a typical cyber-attack demonstrates a series of stages from attack initiation to final resolution, called an attack life cycle. These diverse characteristics and the relentless evolution of cyber-attacks have led cyber defense to adopt modern approaches like Machine Learning (ML) to bolster defensive measures and break the attack life cycle. Among the adopted ML approaches, Graph Neural Networks (GNNs) have emerged as a promising approach for enhancing the effectiveness of defensive measures due to their ability to process and learn from heterogeneous cyber threat data. In this paper, we look into the application of GNNs in helping to break each stage of one of the most renowned attack life cycles, the Lockheed Martin Cyber Kill Chain (CKC). We address each phase of the CKC and discuss how GNNs contribute to preparing for and preventing an attack from a defensive standpoint. We also discuss open research areas and scope for further improvement.